1. Introduction

The one variable secant method, for the solution of equations,
has long been known to be computationally more efficient than
Newton's method. Among the extensions of this method to
n-dimensional problems, those proposed by Wolfe [8] and Barnes [2]
are among the most interesting ones because, when they do converge,
they are considerably more efficient than Newton's method (see, for
example, the discussion in [5]). However, as can be seen from the
counterexamples quoted in [5], these methods may not converge. More
recently, Ritter [7] proposed a new algorithm for function
minimization, combining Goldstein [...] with a secant type method,
which contains an angle test (between the direction of descent and
the gradient) to [...]. This angle test depends on a parameter that
may [...] in advance. A bad selection results in the [...] gradient
mode most, if not all, of the time.
In this paper we present a new gradient-secant algorithm for
unconstrained optimization problems of the form $\min\{f(z) \mid z \in \mathbb{R}^n\}$.

It differs from Ritter's method both in the fact that it does not use
an angle test and in the manner in which it updates the approximate
hessian. Roughly speaking, in solving a problem, this algorithm uses
Armijo gradient method iterations [1] until it reaches a region where the
Newton method is more efficient than the gradient method. Then it
switches over to a secant form of operation. Under the assumption that
$f$ is continuously differentiable, we have shown that any accumulation
point $\hat z$ of a sequence constructed by this algorithm must be
stationary. Under the stronger hypothesis that $f$ is twice continuously
differentiable and strictly convex, we were able to show that any
sequence $\{z_i\}_{i=0}^{\infty}$ constructed by our algorithm converges
superlinearly to the unique minimizer $\hat z$ of $f(\cdot)$, with rate
$\tau_n$, where $\tau_n$ is the unique positive root of
$t^{n+1} - t^n - 1 = 0$, i.e. that for some $\delta \in (0,1)$ and
some $R \in (0,\infty)$,

$$\|z_i - \hat z\| \le R\,\delta^{\tau_n^{\,i}}, \qquad i = 0, 1, 2, \ldots$$

Both theoretical considerations and our computational experiments
indicate that this new algorithm is considerably faster than the
Newton method, and Lootsma [4] reports that on many problems Newton's
method is superior to a number of conjugate direction and quasi-Newton
methods. It is therefore not unrealistic to hope that, as experience
with the new method accumulates, it will emerge as one of the most
efficient methods for the solution of certain classes of unconstrained
optimization problems.
2. The Secant Method.

Consider the problem

1. $\min\{f(z) \mid z \in \mathbb{R}^n\}$

To begin, we shall make only the following minimal assumptions.

2. Assumptions: (a) $f$ is continuously differentiable and, (b) $f$ is
bounded from below. □

Throughout this paper, when we say that an algorithm is convergent,
we mean that every limit point $\hat z$ of a sequence it constructs in
solving (1) satisfies $\nabla f(\hat z) = 0$.

The assumptions (2) will suffice to prove that the algorithm we are
about to state is convergent. We shall later show, under stronger
assumptions, that it converges superlinearly and establish a bound on
its rate of convergence.

We shall use the notation

3. $g(z) = \nabla f(z)$, $z \in \mathbb{R}^n$.

4. Algorithm:

Data: $\delta > 0$, $\alpha \in (0, \tfrac{1}{2})$, $\beta \in (0,1)$, $b > 0$ (large)†, $\ell \ge 2$, $z_0 \in \mathbb{R}^n$,
$H$ a symmetric positive definite $n \times n$ matrix, $e_j$ = $j$-th column of the $n \times n$
unit matrix, $j = 1, 2, \ldots, n$.

Step 0: Set $i = 0$, $j = 0$, $p = 0$, [symbol illegible in the source] $= \delta$, $\bar H = H$. Compute $g(z_0)$ and
set $\gamma_0 = \|g(z_0)\|^2$.

Step 1: Compute $g(z_i)$. Stop if $g(z_i) = 0$.

Step 2: If $j < n$, set $j = j + 1$ and go to step 3; else set $j = 1$ and
go to step 3.

Step 3: Set $\varepsilon_i = \min\{\ldots\}$ (arguments illegible in the source; by construction
$\varepsilon_i \le \|z_i - z_{i-1}\|$, as used in the proof of lemma (22)).

Step 4: Compute $g(z_i + \varepsilon_i e_j)$.

Step 5: Replace $\bar h_j$, the $j$-th column of $\bar H$, by

5. $\dfrac{1}{\varepsilon_i}\,[\,g(z_i + \varepsilon_i e_j) - g(z_i)\,]$

to obtain a new matrix $\bar H$, and set $H_i = \bar H$.‡

Step 6: If $\|g(z_i)\|^2 \le \gamma_p$, go to step 7; else set $w = z_i$ and go to step 15.

Step 7: If $H_i^{-1}$ exists and $\|H_i^{-1}\| \le b$, compute

6. $v_i = H_i^{-1} g(z_i)$

and go to step 8; else set $w = z_i$ and go to step 15.

Step 8: If $\langle -v_i, g(z_i) \rangle < 0$, go to step 9; else set $w = z_i$ and go to
step 15.

Step 9: Set $k = 0$.

Step 10: Compute $f(z_i - \beta^k v_i)$.

Step 11: If

7. $f(z_i - \beta^k v_i) - f(z_i) < 0$

go to step 13; else go to step 12.

Step 12: If $k < \ell$, set $k = k + 1$ and go to step 10; else go to step 15.

Step 13: Compute $g(z_i - \beta^k v_i)$. If $g(z_i - \beta^k v_i) = 0$,
set $z_{i+1} = z_i - \beta^k v_i$ and stop.

Step 14: If

8. $\|g(z_i - \beta^k v_i)\|^2 \le (1 - 2\beta^k \alpha)\,\|g(z_i)\|^2$,

set $z_{i+1} = z_i - \beta^k v_i$, set $\gamma_{p+1} = \|g(z_{i+1})\|^2$, set $p = p+1$, set $i = i+1$
and go to step 2; else set $w = z_i - \beta^k v_i$ and go to step 15.

Step 15: Compute the smallest integer $s_i \ge 0$ such that

9. $f(z_i - \beta^{s_i} g(z_i)) - f(z_i) \le -\beta^{s_i} \alpha\, \|g(z_i)\|^2$

and set $y = z_i - \beta^{s_i} g(z_i)$.

Step 16: If $f(y) < f(w)$, set $z_{i+1} = y$, set $i = i+1$, and go to step 1;
else set $z_{i+1} = w$, set $i = i+1$ and go to step 1. □

† The purpose of the constant $b$ is to make the algorithm use a steepest
descent step whenever $H_i$, the current approximation to the hessian of $f(\cdot)$,
is "too close" to being singular. A lower bound on $b$ is $b \ge 2\,\|H(z)^{-1}\|$
for all $z$ which are local minimizers of $f(\cdot)$. In practice, setting
$b = \infty$ does not appear to destroy the convergence of the algorithm.

‡ Note that since $\bar H_{new}$ differs from $\bar H_{old}$ in only one column, $H_i^{-1} = \bar H_{new}^{-1}$
can be obtained from $\bar H_{old}^{-1}$ by means of the standard updating formula.

Since, when $g(z_i) \ne 0$, one can always find a finite $s_i$ such that (9)
is satisfied, algorithm (4) is obviously well defined.
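The step size rule (9) of step 15 can be illustrated with a minimal Python sketch. The function name `armijo_step`, the default values of `alpha` and `beta`, the cap `max_s`, and the quadratic test function are assumptions made for this illustration; they are not taken from the paper.

```python
def armijo_step(f, grad, z, alpha=0.25, beta=0.5, max_s=60):
    """Return (s, y): the smallest integer s >= 0 satisfying rule (9),
    and the trial point y = z - beta**s * g(z).

    alpha in (0, 1/2) and beta in (0, 1) play the roles they have in the
    algorithm's data; max_s is an illustrative safeguard only.
    """
    g = grad(z)
    g_norm_sq = sum(gi * gi for gi in g)
    fz = f(z)
    for s in range(max_s + 1):
        step = beta ** s
        y = [zi - step * gi for zi, gi in zip(z, g)]
        # Rule (9): f(z - beta^s g) - f(z) <= -beta^s * alpha * ||g||^2
        if f(y) - fz <= -step * alpha * g_norm_sq:
            return s, y
    raise RuntimeError("no admissible step size found within max_s trials")


# Hypothetical test problem: f(z) = 2*z0^2 + 0.5*z1^2.
def f_quad(z):
    return 2.0 * z[0] ** 2 + 0.5 * z[1] ** 2


def g_quad(z):
    return [4.0 * z[0], z[1]]


s, y = armijo_step(f_quad, g_quad, [1.0, 1.0])
print(s, y)
```

On this example the trials s = 0 and s = 1 fail the test and s = 2 is accepted, so the gradient step is taken with step length 0.25.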


10. Theorem: Suppose that the assumptions (2) are satisfied and that
algorithm (4) has constructed an infinite sequence $\{z_i\}_{i=0}^{\infty}$. Then every
limit point $z^*$ of $\{z_i\}_{i=0}^{\infty}$ satisfies $g(z^*) = 0$.

Proof: Suppose that $z^*$ is a limit point of $\{z_i\}$, that $g(z^*) \ne 0$ and that
$z_i \to z^*$ for $i \in K$, with $K$ an infinite subset of the positive integers.

Now there are two possibilities.

(i) There exists an infinite subset $K' \subset K$ such that for all $i \in K'$,
either

11. $z_{i+1} = z_i - \beta^{s_i} g(z_i)$

or

12. $z_{i+1} = z_i - \beta^{k} v_i$

and

13. $f(z_i - \beta^{k} v_i) < f(z_i - \beta^{s_i} g(z_i))$.

Since $g(z^*) \ne 0$ and $z_i \to z^*$ for $i \in K'$, it follows from the discussion in
sec. 2.1 of [6] (see theorem 22 and algorithm 35) that there exists
a $\delta(z^*) < 0$ and an integer $N$ such that

14. $f(z_{i+1}) - f(z_i) \le \delta(z^*) < 0$ for all $i \ge N$, $i \in K'$.

But $K'$ is an infinite subset, and $\{f(z_i)\}_{i=0}^{\infty}$ is a monotonically
decreasing sequence, hence (14) contradicts the assumption that $f(z)$ is
bounded from below. Thus, if (i) holds, then $g(z^*) = 0$.

The second possibility is

(ii) There exists an infinite subset $K'' \subset K$ such that

15. $z_{i+1} = z_i - \beta^{k} v_i$ for all $i \in K''$

and

16. $\|g(z_{i+1})\|^2 \le (1 - 2\beta^{k}\alpha)\,\|g(z_i)\|^2$ for all $i \in K''$.

In this case the sequence $\{\gamma_p\}$ is infinite, monotonically decreasing and
bounded from below by zero. Hence $\gamma_p \to \gamma^* \ge 0$. Now, whenever (15)
and (16) take place, $\gamma_{p+1} = \|g(z_{i+1})\|^2$ for some integer $p$, and, since
$\gamma_p \ge \|g(z_i)\|^2$,

$$\gamma_{p+1} - \gamma_p \le \gamma_{p+1} - \|g(z_i)\|^2 \le -2\beta^{k}\alpha\,\|g(z_i)\|^2 .$$

Hence, since $z_i \to z^*$ for $i \in K''$, since $k \le \ell$, and since $g(\cdot)$ is continuous
by assumption (2), there exists an infinite subset $K'''$ of the positive
integers such that

17. $\gamma_{p+1} - \gamma_p \le -\alpha\beta^{\ell}\,\|g(z^*)\|^2$ for all $p \in K'''$.

But (17) contradicts the fact that $\gamma_p \to \gamma^* \ge 0$.

Hence we must have $g(z^*) = 0$. □

The following result is a direct consequence of theorem (10).

18. Corollary: Algorithm (4) is convergent whenever problem (1) satisfies
the assumptions (2). □

The following corollary can be deduced from theorem (10), which
implies that $g(z_i) \to 0$ as $i \to \infty$ and hence that $z_{i+1} - z_i \to 0$ as $i \to \infty$,
and theorem (1.3.66) in [6].

19. Corollary: Suppose that the sequence $\{z_i\}_{i=0}^{\infty}$ described in theorem
(10) is compact and that the function $f(\cdot)$ has only a finite number
of stationary points. Then there exists a $z^* \in \mathbb{R}^n$ such that $z_i \to z^*$
and $g(z^*) = 0$. □

We are now ready to establish the rate of convergence of algorithm
(4). For this purpose we shall need to assume the following.

20. Assumption: The function $f$ is (a) three times continuously differentiable
and, (b) strictly convex.

Note that under assumption (20), the level sets of $f(\cdot)$ are
compact and there exists only one point $\hat z$ (the minimizer of $f(z)$
over $\mathbb{R}^n$), which satisfies $g(\hat z) = 0$. Hence, by theorem (10) and
corollary (19), whenever assumption (20) is satisfied, any sequence
$\{z_i\}_{i=0}^{\infty}$ constructed by algorithm (4) converges to the unique
minimizer $\hat z$ of $f(\cdot)$.


22. Lemma: Suppose that assumption (20) is satisfied and that algorithm (4)
has constructed an infinite sequence $\{z_i\}_{i=0}^{\infty}$ converging to $\hat z$, the
minimizer of $f(\cdot)$. Then there exists $0 < M < \infty$ such that

$$\|H(z_i) - H_i\| \le M \sum_{j=i-n}^{i} \|z_j - \hat z\| \quad \text{for } i = 0, 1, 2, \ldots$$

where $H_i$ is as defined in the algorithm and

23. $H(z) \triangleq \dfrac{\partial^2 f(z)}{\partial z^2}$, $z \in \mathbb{R}^n$.

Proof: Since (20)(b) is satisfied, the level set
$C(z_0) = \{z \mid f(z) \le f(z_0)\}$ is compact and convex and hence, since (20)(a) is
satisfied, there exists a Lipschitz constant $L < \infty$ such that for all
$x, y \in C(z_0)$,

24. $\|H(x) - H(y)\| \le L\,\|x - y\|$.

(Note that $\{z_i\}_{i=0}^{\infty}$ is contained in $C(z_0)$.) Now, without loss of
generality, suppose that the $j$-th column of $H_i$ ($j \in \{1, 2, \ldots, n\}$) is

25. $\dfrac{1}{\varepsilon_{i-k}}\,[\,g(z_{i-k} + \varepsilon_{i-k} e_j) - g(z_{i-k})\,]$, where $k \in \{0, 1, 2, \ldots, n-1\}$.

Then, making use of (24), of the mean value theorem, and the fact that
$\varepsilon_{i-k} \le \|z_{i-k} - z_{i-k-1}\|$ by construction, we obtain that the magnitude
of the difference between the $j$-th columns of $H(z_i)$ and $H_i$ satisfies

26.
$$\begin{aligned}
\Big\| H(z_i) e_j - \frac{1}{\varepsilon_{i-k}}\,[\,g(z_{i-k} + \varepsilon_{i-k} e_j) - g(z_{i-k})\,] \Big\|
&= \Big\| \int_0^1 [\,H(z_i) - H(z_{i-k} + t\,\varepsilon_{i-k} e_j)\,]\, e_j \, dt \Big\| \\
&\le L \int_0^1 \| z_i - z_{i-k} - t\,\varepsilon_{i-k} e_j \| \, dt \\
&\le L \int_0^1 \big( \|z_i - z_{i-k}\| + t\,\|z_{i-k} - z_{i-k-1}\| \big) \, dt \\
&\le L \big( \|z_i - \hat z\| + \tfrac{3}{2}\|z_{i-k} - \hat z\| + \tfrac{1}{2}\|z_{i-k-1} - \hat z\| \big).
\end{aligned}$$

The existence of a constant $M$ satisfying (22) now follows from (26)
and the triangle inequality for norms, used in conjunction with the
addition and subtraction of terms in the right hand side of (26). □
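The bound in (26) says that a secant column built from a gradient difference approximates the corresponding hessian column with an error proportional to the distances involved. The following Python sketch illustrates this for a single column evaluated at the current point. The test function $f(z) = z_1^4/4 + z_1^2 + z_1 z_2 + z_2^2$ and all helper names are assumptions chosen for the illustration, not part of the paper.

```python
def secant_column(grad, z, eps, j):
    """Column j of the secant matrix, as in (5): (g(z + eps*e_j) - g(z)) / eps."""
    shifted = list(z)
    shifted[j] += eps
    g0 = grad(z)
    g1 = grad(shifted)
    return [(b - a) / eps for a, b in zip(g0, g1)]


# Hypothetical test function f(z) = z0^4/4 + z0^2 + z0*z1 + z1^2.
def grad_demo(z):
    return [z[0] ** 3 + 2.0 * z[0] + z[1], z[0] + 2.0 * z[1]]


def hessian_demo(z):
    return [[3.0 * z[0] ** 2 + 2.0, 1.0], [1.0, 2.0]]


def column_error(z, eps, j):
    """Largest entry-wise gap between the secant column and the exact hessian column."""
    col = secant_column(grad_demo, z, eps, j)
    exact = [row[j] for row in hessian_demo(z)]
    return max(abs(a - b) for a, b in zip(col, exact))


z = [1.0, -0.5]
err_coarse = column_error(z, 1e-2, 0)  # analytically 3*z0*eps + eps^2 = 0.0301
err_fine = column_error(z, 1e-3, 0)    # analytically 0.003001
print(err_coarse, err_fine)
```

Shrinking the increment by a factor of ten shrinks the column error by roughly the same factor, which is the first-order behaviour the Lipschitz argument predicts.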


27. Lemma: Suppose that assumption (20) is satisfied, that $b > 2\,\|H(\hat z)^{-1}\|$
and that the algorithm (4) has constructed a sequence $\{z_i\}_{i=0}^{\infty}$.
Then there exists an integer $N$ such that for all $i \ge N$,

$$z_{i+1} = z_i - H_i^{-1} g(z_i).$$

Proof: First, since $z_i \to \hat z$, the global minimizer of $f(\cdot)$, and (22)
holds, it follows from the perturbation Lemma (2.3.2) in [5] that
there exists an integer $N'$ such that for all $i \ge N'$, $H_i^{-1}$ exists and is
positive definite and $\|H_i^{-1}\| \le b$. Hence, for all $i \ge N'$, the test in
step 8 of the algorithm, i.e., $\langle -v_i, g(z_i) \rangle = -\langle H_i^{-1} g(z_i), g(z_i) \rangle < 0$,
is satisfied and hence the computation proceeds to step 9.
Next, applying the second order Taylor expansion, we obtain (with
$v_i = H_i^{-1} g(z_i)$)

28.
$$\begin{aligned}
f(z_i - H_i^{-1} g(z_i)) - f(z_i)
&= -\langle g(z_i), H_i^{-1} g(z_i) \rangle + \int_0^1 (1-t)\, \langle H_i^{-1} g(z_i), H(z_i - t v_i)\, H_i^{-1} g(z_i) \rangle \, dt \\
&= -\langle g(z_i), H_i^{-1} g(z_i) \rangle + \int_0^1 (1-t)\, \big[ \langle H_i^{-1} g(z_i), H(\hat z)\, H_i^{-1} g(z_i) \rangle \\
&\qquad + \langle H_i^{-1} g(z_i), (H(z_i - t v_i) - H(\hat z))\, H_i^{-1} g(z_i) \rangle \big] \, dt .
\end{aligned}$$

Hence, since $v_i = H_i^{-1} g(z_i) \to 0$ as $i \to \infty$ (because $H_i^{-1} \to H(\hat z)^{-1}$
and $g(z_i) \to 0$), causing $H(z_i - t v_i) - H(\hat z) \to 0$ uniformly for $t \in [0,1]$
as $i \to \infty$, and since $H(\hat z) H_i^{-1} \to I$, the unit matrix, as $i \to \infty$, it follows
from (28) that there is an integer $N'' \ge N'$ such that

29. $f(z_i - H_i^{-1} g(z_i)) - f(z_i) < 0$ for all $i \ge N''$,

i.e. the test in step 11 of the algorithm is satisfied with $k = 0$ for all
$i \ge N''$.

Finally, consider $\tfrac{1}{2}\|g(z)\|^2$. Since $f$ is three times continuously
differentiable,

30. $\dfrac{\partial^2}{\partial z^2}\Big(\tfrac{1}{2}\|g(z)\|^2\Big) = H(z)^T H(z) + W(z)$

where $W(z)$ is a continuous $n \times n$ matrix which satisfies $W(\hat z) = 0$. Hence,
for all $i \ge N''$, expanding $\|g(z_i - H_i^{-1} g(z_i))\|^2$ to second order terms
according to the Taylor formula, we obtain

31.
$$\begin{aligned}
\|g(z_i - H_i^{-1} g(z_i))\|^2 &= \|g(z_i)\|^2 - 2\,\langle H(z_i)^T g(z_i), H_i^{-1} g(z_i) \rangle \\
&\quad + 2 \int_0^1 (1-t)\, \big[ \langle H_i^{-1} g(z_i), H(z_i - t v_i)^T H(z_i - t v_i)\, H_i^{-1} g(z_i) \rangle \\
&\qquad + \langle H_i^{-1} g(z_i), W(z_i - t v_i)\, H_i^{-1} g(z_i) \rangle \big] \, dt ,
\end{aligned}$$

where $v_i = H_i^{-1} g(z_i)$. Setting $H_i(t) = H(z_i - t v_i)$ and $W_i(t) = W(z_i - t v_i)$,
$t \in [0,1]$, (31) yields

32. [equation illegible in the source]

Since $v_i \to 0$ as $i \to \infty$ and $z_i \to \hat z$ as $i \to \infty$, $W_i(t) \to 0$ as $i \to \infty$,
uniformly in $t \in [0,1]$ and similarly, $H_i(t) H_i^{-1} \to I$ as $i \to \infty$ uniformly in
$t \in [0,1]$. Hence the term in the right hand side of (32), multiplied
by $1/\|g(z_i)\|^2$, tends to zero as $i \to \infty$ and therefore there exists an integer
$N''' \ge N''$ such that the test (8) is satisfied for $k = 0$ for all $i \ge N'''$.

Now, since $g(z_i) \to 0$ as $i \to \infty$, there exists an integer $N \ge N'''$ at
which the test in step 6, viz. $\|g(z_i)\|^2 \le \gamma_p$, will be satisfied. Then,
for all $i \ge N$, $z_{i+1} = z_i - H_i^{-1} g(z_i)$, which completes our proof. □
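Once the iteration settles into the form $z_{i+1} = z_i - H_i^{-1} g(z_i)$, each step changes only one column of the secant matrix, so its inverse can be refreshed without a full inversion, as the footnote to step 5 remarks. The paper does not say which updating formula it has in mind; the sketch below assumes the Sherman-Morrison rank-one formula, and the function name and the 2 x 2 example are hypothetical.

```python
def replace_column_inverse(B, h_old, a_new, j):
    """Given B = inverse(H), return the inverse of H with column j replaced.

    Writing u = a_new - h_old, the new matrix is H + u * e_j^T, and the
    Sherman-Morrison formula gives
        inverse = B - (B u)(e_j^T B) / (1 + e_j^T B u).
    """
    n = len(B)
    u = [a_new[r] - h_old[r] for r in range(n)]
    Bu = [sum(B[r][c] * u[c] for c in range(n)) for r in range(n)]
    denom = 1.0 + Bu[j]  # e_j^T B u is the j-th entry of B u
    if abs(denom) < 1e-12:
        raise ZeroDivisionError("updated matrix is singular or nearly singular")
    row_j = B[j]  # e_j^T B is the j-th row of B
    return [[B[r][c] - Bu[r] * row_j[c] / denom for c in range(n)]
            for r in range(n)]


# Hypothetical example: H = diag(2, 2); replace column 0 by (4, 1).
B_old = [[0.5, 0.0], [0.0, 0.5]]
B_new = replace_column_inverse(B_old, h_old=[2.0, 0.0], a_new=[4.0, 1.0], j=0)
print(B_new)
```

The update costs on the order of $n^2$ operations per iteration, compared with roughly $n^3$ for inverting the new matrix from scratch. A small denominator signals that the new matrix is close to singular, which is the situation the constant $b$ in step 7 guards against.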

33. Theorem: Suppose that assumption (20) is satisfied and that algorithm
(4) has constructed a sequence $\{z_i\}_{i=0}^{\infty}$. Then

34. $0 < \limsup_{i \to \infty} \|z_i - \hat z\|^{1/\tau_n^{\,i}} < 1$,

where $\tau_n$ is the unique positive root of the equation $t^{n+1} - t^n - 1 = 0$
and $\hat z$ is the unique minimizer of $f(\cdot)$ (i.e., the R-order of algorithm (4)
is $\tau_n$, where R-order is defined by (9.2.5) in [5]).
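To get a feeling for the rate $\tau_n$ in (34), the root of $t^{n+1} - t^n - 1 = 0$ can be computed numerically. The sketch below uses plain bisection; the function name `tau` and the tolerance are assumptions made for the illustration.

```python
def tau(n, tol=1e-12):
    """Unique positive root of t**(n+1) - t**n - 1 = 0, by bisection on [1, 2].

    p(t) = t**n * (t - 1) - 1 is negative on (0, 1], increasing for t > 1,
    and satisfies p(1) = -1 < 0 and p(2) = 2**n - 1 > 0 for n >= 1.
    """
    def p(t):
        return t ** (n + 1) - t ** n - 1.0

    lo, hi = 1.0, 2.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if p(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


for n in (1, 2, 3, 5, 10):
    print(n, round(tau(n), 6))
```

For $n = 1$ the root is the golden ratio, approximately 1.618, the familiar order of the one variable secant method. The root decreases toward 1 as $n$ grows, so the guaranteed order per iteration weakens with the dimension.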

Proof: Let $N$ be an integer such that for all $i \ge N$

35. $z_{i+1} = z_i - H_i^{-1} g(z_i)$.

By Lemma (27), such an $N$ exists. Then since $H(\cdot)$ is Lipschitz
continuous on $C(z_0)$, for all $i \ge N$ (since $g(\hat z) = 0$)

36.
$$\begin{aligned}
\|z_{i+1} - \hat z\| &= \| (z_i - \hat z) - H_i^{-1} (g(z_i) - g(\hat z)) \| \\
&= \Big\| \int_0^1 \big( I - H_i^{-1} H(\hat z + t (z_i - \hat z)) \big) (z_i - \hat z) \, dt \Big\| \\
&\le \int_0^1 \|H_i^{-1}\| \, \| H_i - H(\hat z + t (z_i - \hat z)) \| \, \|z_i - \hat z\| \, dt \\
&\le \|H_i^{-1}\| \int_0^1 \big[ \|H_i - H(z_i)\| + \| H(z_i) - H(\hat z + t (z_i - \hat z)) \| \big] \, \|z_i - \hat z\| \, dt \\
&\le \|H_i^{-1}\| \int_0^1 \big[ \|H_i - H(z_i)\| + L (1-t) \|z_i - \hat z\| \big] \, \|z_i - \hat z\| \, dt ,
\end{aligned}$$

where $L$ is the Lipschitz constant for $H(\cdot)$ on $C(z_0)$. Now making use of
Lemma (22) and the fact that $\|H_i^{-1}\|$ is bounded for $i \ge N$
(since $H_i \to H(\hat z)$), we conclude from (36) that there exist constants
$\lambda_j > 0$, $j = 0, 1, 2, \ldots, n$, such that for all $i \ge N$

37. $\|z_{i+1} - \hat z\| \le \|z_i - \hat z\| \sum_{j=0}^{n} \lambda_j \|z_{i-j} - \hat z\|$.

The desired result now follows from (37) and theorem (9.2.9) in [5]. □

38. Corollary: Under the conditions in theorem (33), any sequence $\{z_i\}_{i=0}^{\infty}$
constructed by algorithm (4) satisfies

39. $\limsup_{i \to \infty} \|g(z_i)\|^{1/\tau_n^{\,i}} < 1$,

40. $\limsup_{i \to \infty} [f(z_i) - f(\hat z)]^{1/\tau_n^{\,i}} < 1$.

Proof: Since $g(\hat z) = 0$,

41.
$$\begin{aligned}
\|g(z_i)\| &= \Big\| \int_0^1 H(\hat z + t (z_i - \hat z)) (z_i - \hat z) \, dt \Big\| \\
&\le \Big[ \int_0^1 \| H(\hat z + t (z_i - \hat z)) \| \, dt \Big] \|z_i - \hat z\| \\
&\le Q \|z_i - \hat z\|
\end{aligned}$$

where $Q = \sup \{ \|H(z)\| \mid z \in C(z_0) \}$. Relation (39) now follows from
(41), (34) and the fact that $Q^{1/\tau_n^{\,i}} \to 1$ as $i \to \infty$.

Next, again since $g(\hat z) = 0$,

42.
$$\begin{aligned}
f(z_i) - f(\hat z) &= \int_0^1 (1-t)\, \langle (z_i - \hat z), H(\hat z + t (z_i - \hat z)) (z_i - \hat z) \rangle \, dt \\
&\le \int_0^1 (1-t)\, Q \, \|z_i - \hat z\|^2 \, dt \\
&= \tfrac{1}{2} Q \|z_i - \hat z\|^2
\end{aligned}$$

where $Q$ is an upper bound on the eigenvalues of $H(z)$ for $z \in C(z_0)$.
Relation (40) now follows from (42), (34) and the fact that
$(\tfrac{1}{2} Q)^{1/\tau_n^{\,i}} \to 1$ as $i \to \infty$. This completes our proof. □
We note that the only time we made use of the assumption that $f(\cdot)$
was three times continuously differentiable and strictly convex was in
the proof of lemma (27). At this point it is easy to show that lemma (27)
can also be proved under the weaker assumption that $f(\cdot)$ is only twice
continuously differentiable, strictly convex, and (24) holds. Thus suppose that
$\{z_i\}_{i=0}^{\infty}$ is any sequence such that $z_i \to \hat z$ as $i \to \infty$ and that
$y_{i+1} = z_i - H_i^{-1} g(z_i)$ for all $i \ge N$, where $N$ is such that $H_i^{-1}$ exists
for all $i \ge N$. Then (37) applies and yields

43. $\|y_{i+1} - \hat z\| \le \|z_i - \hat z\| \sum_{j=0}^{n} \lambda_j \|z_{i-j} - \hat z\|$, for all $i \ge N$.

Also, by the mean value theorem, since $g(\hat z) = 0$,

44.
$$\begin{aligned}
\|g(y_{i+1})\| &\le \int_0^1 \| H(\hat z + s (y_{i+1} - \hat z)) \| \, ds \; \|y_{i+1} - \hat z\| \\
&\le Q \|y_{i+1} - \hat z\| \\
&\le Q \|z_i - \hat z\| \sum_{j=0}^{n} \lambda_j \|z_{i-j} - \hat z\| ,
\end{aligned}$$

where $Q$ is as in (41).

Now,

45.
$$\begin{aligned}
\|z_i - \hat z\| \, \|g(z_i)\| &\ge \langle z_i - \hat z, g(z_i) \rangle \\
&= \int_0^1 \langle z_i - \hat z, H(\hat z + s (z_i - \hat z)) (z_i - \hat z) \rangle \, ds \\
&\ge q \|z_i - \hat z\|^2 ,
\end{aligned}$$

where $q$ is a lower bound on the eigenvalues of $H(z)$ for $z \in C(z_0)$
(compare (42)). Hence for all $i \ge N$

46. $\|g(z_i)\| \ge q \|z_i - \hat z\|$

and therefore (44) yields

47. $\|g(y_{i+1})\| \le \|g(z_i)\| \Big\{ \dfrac{Q}{q} \sum_{j=0}^{n} \lambda_j \|z_{i-j} - \hat z\| \Big\}$ for all $i \ge N$.

Since $\sum_{j=0}^{n} \|z_{i-j} - \hat z\| \to 0$ as $i \to \infty$, we conclude that there exists an
integer $\bar N \ge N$ such that

$$\|g(y_{i+1})\|^2 \le (1 - 2\alpha\beta^{0})\,\|g(z_i)\|^2 \quad \text{for all } i \ge \bar N ,$$

i.e. test (8) is satisfied with $k = 0$, and hence that lemma (27) holds
under the weaker assumption that $f$ is only twice continuously
differentiable and strictly convex.

Conclusion 

We have presented in this paper an efficient method for unconstrained 
minimization. It should be clear from the development that the 
assumptions used to establish rate of convergence can be relaxed from a 
global statement to a local one, i.e. as holding in a convex neighborhood
of a local minimum. It is also clear that one can construct several 
other variants of the algorithm as, for example, by substituting a 
conjugate directions method for the gradient method in the algorithm. 

In some applications these alternative, more complex versions may be
preferred over the simplest one presented in this paper. As long as one
substitutes for the Armijo method any other minimization method, the
convergence and rate of convergence theorems presented in this paper
remain valid.